45 research outputs found
Flexible Tweedie regression models for continuous data
Tweedie regression models provide a flexible family of distributions to deal
with non-negative highly right-skewed data as well as symmetric and heavy
tailed data and can handle continuous data with probability mass at zero. The
estimation and inference of Tweedie regression models based on the maximum
likelihood method are challenged by the presence of an infinite sum in the
probability function and non-trivial restrictions on the power parameter space.
In this paper, we propose two approaches for fitting Tweedie regression models,
namely, quasi- and pseudo-likelihood. We discuss the asymptotic properties of
the two approaches and perform simulation studies to compare our methods with
the maximum likelihood method. In particular, we show that the quasi-likelihood
method provides asymptotically efficient estimation for regression parameters.
The computational implementation of the alternative methods is faster and
easier than the orthodox maximum likelihood, relying on a simple Newton scoring
algorithm. Simulation studies showed that the quasi- and pseudo-likelihood
approaches present estimates, standard errors and coverage rates similar to the
maximum likelihood method. Furthermore, the second-moment assumptions required
by the quasi- and pseudo-likelihood methods enable us to extend the Tweedie
regression models to the class of quasi-Tweedie regression models in
Wedderburn's style. Moreover, they allow us to eliminate the non-trivial
restriction on the power parameter space, and thus provides a flexible
regression model to deal with continuous data. We provide \texttt{R}
implementation and illustrate the application of Tweedie regression models
using three data sets. Comment: 34 pages, 8 figures
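The quasi-likelihood idea in the abstract above can be illustrated with a minimal sketch. It assumes a log link, a single covariate with an intercept, and a fixed power parameter p, and it uses only the second-moment assumptions E[y] = mu and Var[y] = phi * mu^p. The function name `fit_quasi_tweedie` and the Pearson-type estimate of the dispersion are illustrative choices, not the authors' R implementation.

```python
import math

def fit_quasi_tweedie(x, y, p, n_iter=50, tol=1e-10):
    """Fisher scoring for E[y] = exp(b0 + b1*x), Var[y] = phi * mu**p.

    Hypothetical helper for illustration; p is treated as known.
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(n_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for xi, yi in zip(x, y):
            mu = math.exp(b0 + b1 * xi)
            score_w = mu ** (1.0 - p) * (yi - mu)  # quasi-score weight (phi cancels)
            info_w = mu ** (2.0 - p)               # expected-information weight
            u0 += score_w
            u1 += score_w * xi
            i00 += info_w
            i01 += info_w * xi
            i11 += info_w * xi * xi
        # Solve the 2x2 system (information) * step = score.
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Pearson-type moment estimate of the dispersion phi (an assumption here).
    n = len(y)
    phi = sum(
        (yi - math.exp(b0 + b1 * xi)) ** 2 / math.exp(b0 + b1 * xi) ** p
        for xi, yi in zip(x, y)
    ) / (n - 2)
    return b0, b1, phi

if __name__ == "__main__":
    # Made-up data with an exact zero, which the moment assumptions tolerate.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [0.0, 1.2, 0.8, 2.9, 3.1, 6.4]
    print(fit_quasi_tweedie(xs, ys, p=1.5))
```

Because the dispersion cancels in the scoring step, the regression coefficients can be updated without evaluating the Tweedie density, which is where the infinite sum arises.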
Associated kernel discriminant analysis for multivariate mixed data
Associated kernels have been introduced to improve the classical (symmetric) continuous kernels for smoothing any functional on several kinds of supports such as bounded continuous and discrete sets. In this paper, an associated kernel for discriminant analysis with multivariate mixed variables is proposed. These variables are of three types: continuous, categorical and count. The method consists of using a product of adapted univariate associated kernels and an estimate of the misclassification rate. A new profile version of the cross-validation procedure for bandwidth matrix selection is introduced for multivariate mixed data, while a classical cross-validation is used for homogeneous data sets having the same reference measures. Simulations and validation results show the relevance of the proposed method. The method has been validated on real coronary heart disease data in comparison to the classical kernel discriminant analysis.
Asymptotic properties of the normalized discrete associated-kernel estimator for probability mass function
Discrete kernel smoothing is now gaining importance in nonparametric
statistics. In this paper, we investigate some asymptotic properties of the
normalized discrete associated-kernel estimator of a probability mass function.
We show, under some regularity and non-restrictive assumptions on the
associated-kernel, that the normalizing random variable converges in mean
square to 1. We then derive the consistency and the asymptotic normality of the
proposed estimator. Various families of discrete kernels already exhibited
satisfy the conditions, including the refined CoM-Poisson which is
underdispersed and of second-order. Finally, the first-order binomial kernel is
discussed and, surprisingly, its normalized estimator has a suitable asymptotic
behaviour through simulations. Comment: 20 pages, 3 figures
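A normalized discrete associated-kernel estimator of the kind studied above can be sketched as follows. The sketch assumes the binomial kernel in its usual form, where the kernel at target x with bandwidth h is the Binomial(x + 1, (x + h)/(x + 1)) distribution on {0, ..., x + 1}. The truncation point `x_max` for the normalizing sum and the function names are illustrative assumptions.

```python
from math import comb

def binomial_kernel(x, h, y):
    """Binomial associated kernel at target x, bandwidth h in (0, 1], evaluated at y."""
    n = x + 1
    if y < 0 or y > n:
        return 0.0
    q = (x + h) / n
    return comb(n, y) * q ** y * (1.0 - q) ** (n - y)

def normalized_pmf_estimate(sample, h, x_max):
    """Return (normalized estimates on 0..x_max, normalizing constant).

    The normalizing sum is truncated at x_max, which is an assumption of this
    sketch; x_max should sit well above the largest observation.
    """
    n = len(sample)
    raw = [
        sum(binomial_kernel(x, h, xi) for xi in sample) / n
        for x in range(x_max + 1)
    ]
    c = sum(raw)  # the raw estimates need not sum to 1
    return [v / c for v in raw], c

if __name__ == "__main__":
    counts = [0, 1, 1, 2, 2, 2, 3, 5]  # made-up count sample
    estimate, c = normalized_pmf_estimate(counts, h=0.1, x_max=30)
    print(c, estimate[:6])
```

Dividing by the constant `c` is what makes the estimate a genuine probability mass function on the truncated support.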
On semiparametric regression for count explanatory variables
We study the problem of semiparametric estimation of a multivariate count regression function m : ℕ^d → ℝ that can be represented as a product of an unknown discrete parametric function r and an unknown discrete smooth function w. To construct such estimators, we first find an approximation r̂ of the parametric part r, and then estimate the nonparametric multiplicative correction factor w = m/r̂ by a discrete associated-kernel method. Comparisons are then carried out with the nonparametric count regression estimator of Nadaraya-Watson type. We point out that the new semiparametric count regression estimator can reduce the bias relative to the purely nonparametric count regression estimator, without affecting the variance.
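The purely nonparametric comparator mentioned above, a Nadaraya-Watson type estimator for a count covariate, can be sketched in the univariate case (d = 1). The binomial associated kernel is an assumed choice of discrete kernel, and the function names are illustrative; the semiparametric estimator of the paper is not reproduced here.

```python
from math import comb

def binomial_kernel(x, h, y):
    """Binomial associated kernel at target x, bandwidth h in (0, 1], evaluated at y."""
    n = x + 1
    if y < 0 or y > n:
        return 0.0
    q = (x + h) / n
    return comb(n, y) * q ** y * (1.0 - q) ** (n - y)

def nw_count_regression(x, xs, ys, h):
    """Nadaraya-Watson estimate of m(x) from count covariates xs and responses ys."""
    weights = [binomial_kernel(x, h, xi) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        # Every observation lies outside the kernel support {0, ..., x + 1}.
        raise ValueError("no observation receives positive kernel weight")
    return sum(w * yi for w, yi in zip(weights, ys)) / total

if __name__ == "__main__":
    xs = [0, 1, 2, 3, 5]              # made-up count covariates
    ys = [1.0, 1.8, 3.1, 3.9, 6.2]    # made-up responses
    print(nw_count_regression(2, xs, ys, h=0.2))
```

The estimate is a weighted average of the responses, so it always lies between the smallest and largest observed response.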
Some discrete exponential dispersion models: Poisson-Tweedie and Hinde-Demétrio classes
In this paper we investigate two classes of exponential dispersion models (EDMs) for overdispersed count data with respect to the Poisson distribution. The first is a class of Poisson mixtures with positive Tweedie mixing distributions. As an approximation (in terms of unit variance function) of the first, the second is a new class of EDMs characterized by their unit variance functions of the form μ + μ^p, where p is a real index related to a precise model. These two classes provide some alternatives to the negative binomial distribution (p = 2), which is classically used in the framework of regression models for count data when overdispersion results in a lack of fit of the Poisson regression model. Some properties are then studied and the practical usefulness is also discussed.
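The unit variance function μ + μ^p of the second class can be made concrete with a short sketch. The function names are illustrative, and the dispersion index (variance divided by mean) is added here only to show the overdispersion relative to the Poisson case, where the index equals 1.

```python
def hinde_demetrio_unit_variance(mu, p):
    """Unit variance function V(mu) = mu + mu**p for a positive mean mu."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return mu + mu ** p

def dispersion_index(mu, p):
    """Variance-to-mean ratio, equal to 1 + mu**(p - 1)."""
    return hinde_demetrio_unit_variance(mu, p) / mu

if __name__ == "__main__":
    # p = 2 gives the negative binomial form mu + mu**2.
    for p in (2.0, 3.0):
        print(p, hinde_demetrio_unit_variance(3.0, p), dispersion_index(3.0, p))
```

For any positive mean the index exceeds 1, which is the sense in which these models are overdispersed with respect to the Poisson distribution.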
On d-pseudo-orthogonality of the Sheffer systems associated to a convolution semigroup
We investigate which Sheffer polynomials can be associated to a convolution semigroup of probability measures, usually induced by a stochastic process with stationary and independent increments. From a recent notion of d-pseudo-orthogonality (d ∈ {2, 3, ...}), we characterize the associated d-pseudo-orthogonal polynomials by the class of generating probability measures, which belongs to the natural exponential family with polynomial variance functions of exact degree 2d - 1. This extends some results of (classical) orthogonality; in particular, some new sets of martingales are then pointed out. For each integer d ≥ 2 we completely illustrate polynomials with (2d-1)-term recurrence relation for the families of positive stable processes.